ISO 8859-1 character set overview
The HTML specifications state that HTML uses the ISO 8859-1 (Latin 1)
character set for the encoding of documents. If you want to send out
an HTML document and ensure everyone will be able to read it as you
intended, it must be in this character set. If the protocol you use is
not fully 8-bit (for example e-mail, a post to Usenet, or FTP in "ascii"
mode), do not use the characters above 127 directly; write them in
escaped form instead.
(Of course, the above does not apply if you are writing for a specific
group of users, or need another character set for your language.)
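The escaping recommended above for channels that are not 8-bit clean can be sketched in Python as follows. This is only an illustration: the function name escape_for_7bit is hypothetical, and it assumes the input is already limited to ISO 8859-1 characters (it raises an error otherwise).

```python
def escape_for_7bit(text: str) -> str:
    """Return text with every character above 127 written as &#NNN;.

    Hypothetical helper for illustration; not part of any HTML tool.
    """
    # Assumption: the document should stay within ISO 8859-1.
    # This raises UnicodeEncodeError if a character falls outside it.
    text.encode("latin-1")
    # "xmlcharrefreplace" turns each non-ASCII character into a
    # decimal character reference such as &#233;
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    print(escape_for_7bit("Café © 1996"))
```

The result contains only 7-bit characters, so it should survive e-mail, Usenet or an "ascii" mode FTP transfer unchanged.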
The following tables give all characters which are available in the
ISO Latin 1 character set. In each table, you will see four columns:
- Char. This is the actual character.
- Code. This is the decimal code number for the character.
- Name. This is the entity name for the character.
- Description. A short description of the character.
In all cases, you may use the decimal code number to represent the
character, or the entity name if that's available. A number is used
like this: &#169; to represent the 169th character. Since
this character also has a name, you can also use &copy;
to represent it.
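As a rough illustration of the two notations, the sketch below builds both forms for a character using Python's standard html module. The helper name references_for is hypothetical; the name lookup relies on the HTML 4 entity table shipped with Python, which may list names that were not yet defined when this overview was written.

```python
from html import unescape
from html.entities import codepoint2name


def references_for(char: str):
    """Return (numeric reference, named reference or None) for one character.

    Hypothetical helper for illustration only.
    """
    code = ord(char)
    numeric = f"&#{code};"
    # Not every character has an entity name, so the lookup may fail.
    name = codepoint2name.get(code)
    named = f"&{name};" if name else None
    return numeric, named


if __name__ == "__main__":
    print(references_for("©"))
    # Both notations decode back to the same character.
    print(unescape("&#169;") == unescape("&copy;"))
```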
The table with characters uses a small GIF image for each character.
This means you need to load up to 32 images per table. A faster way
is probably to download the screenshot for the table, and use that
as a reference.
A GIF image with the complete overview is
also available (1143x1530 pixels, 70K).
Notes
ISO 8859-1 explicitly does not define displayable characters for
positions 0-31 and 127-159, and the HTML standard does not allow
those positions to be used for displayable characters. The only
characters in this range that are used are 9, 10 and 13, which are
tab, newline and carriage return respectively. If you attempt to
display these invalid characters on your own system, you may find
some characters displayed there, but please do not assume that other
users will see the same thing (or even anything at all) on their
systems.
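A simple check for these positions might look like the sketch below. It assumes the document is available as raw ISO 8859-1 bytes; the function name undisplayable_positions and the sample input are made up for illustration.

```python
# Tab, newline and carriage return are the only permitted codes
# in the ranges 0-31 and 127-159.
ALLOWED_CONTROLS = {9, 10, 13}


def undisplayable_positions(data: bytes) -> list:
    """Return (offset, code) pairs for bytes HTML does not allow as characters.

    Hypothetical helper for illustration only.
    """
    bad = []
    for offset, code in enumerate(data):
        in_undefined_range = code <= 31 or 127 <= code <= 159
        if in_undefined_range and code not in ALLOWED_CONTROLS:
            bad.append((offset, code))
    return bad


if __name__ == "__main__":
    # Codes 147 and 148 show up as curly quotes on some systems,
    # but they are not defined in ISO 8859-1.
    sample = b"ok\tline\n\x93quoted\x94"
    print(undisplayable_positions(sample))
```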
Although the specs require that all browsers support this character set,
not all actually do. In particular, Macintosh browsers display the
following 14 characters incorrectly: the broken vertical bar (¦),
superscript 1 (¹), 2 (²) and 3 (³), quarter (¼), half (½),
three quarters (¾), uppercase (Ð) and lowercase (ð) eth,
uppercase (Þ) and lowercase (þ) thorn, uppercase (Ý) and
lowercase (ý) y acute, and the multiplication sign (×).
Macintosh users might want to install Profont, a
monospaced font that displays all entities correctly. Alan Flavell
maintains a more extensive discussion of this topic.
In most cases, you will not need to use the &quot; entity for the
double quote ("). It might come in handy if you need it inside a
quoted attribute value, for example as in
ALT="My &quot;new&quot; site".
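If attribute values are generated by a program, the quoting can be done mechanically. The following is a minimal sketch using Python's html.escape; the helper name alt_attribute is hypothetical, and note that html.escape also escapes &, < and > (and the single quote), which goes slightly beyond the case described above.

```python
from html import escape


def alt_attribute(value: str) -> str:
    """Build a double-quoted ALT attribute from arbitrary text.

    Hypothetical helper for illustration only.
    """
    # quote=True makes escape() replace the double quote with &quot;
    return f'ALT="{escape(value, quote=True)}"'


if __name__ == "__main__":
    print(alt_attribute('My "new" site'))
```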
Copyright © 1996 Arnoud "Galactus" Engelfriet.